OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A blackboard approach towards integrated Farsi OCR system

Internal identifier: 000B00 (Main/Exploration); previous: 000A99; next: 000B01

Authors: Hossein Khosravi [Iran]; Ehsanollah Kabir [Iran]

Source: International journal on document analysis and recognition (Print) [1433-2833]; 2009

RBID: Pascal:10-0180822

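French descriptors

Reconnaissance caractère; Reconnaissance optique caractère; Texte; Système information; Classification; Analyse statistique; Approche probabiliste; Extraction forme; Segmentation

English descriptors

Character recognition; Classification; Information system; Optical character recognition; Pattern extraction; Probabilistic approach; Segmentation; Statistical analysis; Text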

Abstract

An integrated OCR system for Farsi text is proposed. The system draws on several knowledge sources (KSs) and coordinates them through a blackboard architecture. Some KSs, such as classifiers, are acquired a priori through offline training, while others, such as statistical features, are extracted online during recognition. An arbiter controls the interactions between the solution blackboard and the KSs. The system was tested on 20 real-life scanned documents covering ten popular Farsi fonts and achieved recognition rates of 97.05% at the word level and 99.03% at the character level.
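
To make the blackboard idea in the abstract concrete, here is a minimal sketch of the general pattern: a shared blackboard, several knowledge sources, and an arbiter that decides which source runs next. It is not the authors' implementation. Every name in it (`Blackboard`, `KnowledgeSource`, `Arbiter`, `TOY_MODEL`, `demo`) is hypothetical, the "image" is a plain string standing in for a scanned word, and the scheduling policy (first runnable source in list order) is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared solution store that every knowledge source reads and writes."""
    hypotheses: Dict[str, object] = field(default_factory=dict)


@dataclass
class KnowledgeSource:
    """Hypothetical KS: runnable once its inputs are on the blackboard."""
    name: str
    needs: List[str]                      # keys that must already be present
    produces: str                         # key this KS writes
    action: Callable[[Blackboard], object]

    def can_run(self, bb: Blackboard) -> bool:
        # Run at most once, and only when all inputs are available.
        return (self.produces not in bb.hypotheses
                and all(key in bb.hypotheses for key in self.needs))


class Arbiter:
    """Selects the next runnable KS until none can contribute."""

    def __init__(self, sources: List[KnowledgeSource]):
        self.sources = sources

    def run(self, bb: Blackboard) -> List[str]:
        trace = []
        while True:
            ready = [ks for ks in self.sources if ks.can_run(bb)]
            if not ready:
                return trace
            ks = ready[0]  # assumed policy: first runnable KS in list order
            bb.hypotheses[ks.produces] = ks.action(bb)
            trace.append(ks.name)


# Toy stand-in for a classifier acquired through offline training.
TOY_MODEL = {"g1": "a", "g2": "b"}


def demo():
    bb = Blackboard({"image": "g1 g2 g1"})
    sources = [
        KnowledgeSource("word_builder", ["characters"], "word",
                        lambda b: "".join(b.hypotheses["characters"])),
        KnowledgeSource("classifier", ["glyphs"], "characters",
                        lambda b: [TOY_MODEL[g] for g in b.hypotheses["glyphs"]]),
        KnowledgeSource("segmenter", ["image"], "glyphs",
                        lambda b: b.hypotheses["image"].split()),
    ]
    trace = Arbiter(sources).run(bb)
    return bb.hypotheses["word"], trace
```

The sources are deliberately listed out of order: the arbiter, not the list order, determines that segmentation runs before classification and word assembly. A real system would likely use a richer selection policy, for example one based on confidence scores.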


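Affiliations:

Iran: Hossein Khosravi, Ehsanollah Kabir (Department of Electrical Engineering, Tarbiat Modarres University, Tehran)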


Links to previous steps (curation, corpus...)

Main: Curation 000B00; Merge 000B11
PascalFrancis: Checkpoint 000217; Curation 000584; Corpus 000193


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author>
<name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0180822</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0180822 INIST</idno>
<idno type="RBID">Pascal:10-0180822</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000193</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000584</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000217</idno>
<idno type="wicri:doubleKey">1433-2833:2009:Khosravi H:a:blackboard:approach</idno>
<idno type="wicri:Area/Main/Merge">000B11</idno>
<idno type="wicri:Area/Main/Curation">000B00</idno>
<idno type="wicri:Area/Main/Exploration">000B00</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author>
<name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Classification</term>
<term>Information system</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Probabilistic approach</term>
<term>Segmentation</term>
<term>Statistical analysis</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Système information</term>
<term>Classification</term>
<term>Analyse statistique</term>
<term>Approche probabiliste</term>
<term>Extraction forme</term>
<term>Segmentation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Iran</li>
</country>
</list>
<tree>
<country name="Iran">
<noRegion>
<name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
</noRegion>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B00 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000B00 | SxmlIndent | more
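
Once a record has been extracted, its fields can be read with any XML library. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the record. `SAMPLE` and `summarize` are hypothetical names introduced for illustration. The trimmed sample omits the elements and attributes that carry the `wicri:` and `inist:` prefixes, because the record as shown does not declare those namespaces and a strict parser would reject them without a declaration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record, limited to unprefixed elements.
SAMPLE = """<record><TEI><teiHeader>
<fileDesc><titleStmt>
<title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
</titleStmt></fileDesc>
<profileDesc><textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Optical character recognition</term>
</keywords>
</textClass></profileDesc>
</teiHeader></TEI></record>"""


def summarize(xml_text):
    """Return the title and the English keywords of a record."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//titleStmt/title")
    terms = [t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")]
    return title, terms


if __name__ == "__main__":
    title, terms = summarize(SAMPLE)
    print(title)
    print("; ".join(terms))
```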

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:10-0180822
   |texte=   A blackboard approach towards integrated Farsi OCR system
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024